BayCis: A Bayesian Hierarchical HMM for Cis-Regulatory Module Decoding in Metazoan Genomes
Authors
Abstract
The transcriptional regulatory sequences in metazoan genomes often consist of multiple cis-regulatory modules (CRMs). Each CRM contains locally enriched occurrences of binding sites (motifs) for a certain array of regulatory proteins, and is capable of integrating, amplifying or attenuating multiple regulatory signals via combinatorial interaction with these proteins. The architecture of CRM organization is reminiscent of the grammatical rules underlying a natural language, and presents a particular challenge to computational motif and CRM identification in metazoan genomes. In this paper, we present BayCis, a Bayesian hierarchical HMM that attempts to capture the stochastic syntactic rules of CRM organization. Under the BayCis model, all candidate sites are evaluated based on a posterior probability measure that takes into consideration their similarity to known binding sites (BSs), their contrast with the local genomic context, their first-order dependencies on upstream sequence elements, as well as priors reflecting general knowledge of CRM structure. We compare our approach to five existing methods for the discovery of CRMs, and demonstrate competitive or superior prediction results evaluated against experimentally based annotations on a comprehensive selection of Drosophila regulatory regions. The software, database and Supplementary Materials will be available at http://www.sailing.cs.cmu.edu/baycis.
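To make the idea of scoring candidate sites by posterior probability concrete, the following is a minimal sketch of posterior decoding in a flat two-state HMM using the forward-backward algorithm. It is not the BayCis model: the hierarchical structure, Bayesian priors and motif models of BayCis are omitted, and the state names, transition probabilities and emission probabilities are all assumed values chosen for illustration.

```python
# Toy posterior decoding with a two-state HMM (illustrative only; NOT the BayCis model).
# All names and numbers below are assumptions made for this sketch.
STATES = ("background", "site")
INIT = [0.9, 0.1]                      # assumed initial state probabilities
TRANS = [[0.9, 0.1],                   # assumed first-order transition probabilities
         [0.3, 0.7]]
EMIT = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # background: uniform
    {"A": 0.05, "C": 0.45, "G": 0.45, "T": 0.05},  # site: GC-rich (assumed)
]


def posterior_site_probability(seq):
    """Return P(state == 'site' | seq) for each position (scaled forward-backward)."""
    n, k = len(seq), len(STATES)

    # Forward pass, normalised at every step to avoid numerical underflow.
    fwd = [[0.0] * k for _ in range(n)]
    scale = [0.0] * n
    for t, base in enumerate(seq):
        for j in range(k):
            if t == 0:
                prior = INIT[j]
            else:
                prior = sum(fwd[t - 1][i] * TRANS[i][j] for i in range(k))
            fwd[t][j] = prior * EMIT[j][base]
        scale[t] = sum(fwd[t])
        fwd[t] = [v / scale[t] for v in fwd[t]]

    # Backward pass, reusing the scaling factors from the forward pass.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = seq[t + 1]
        for i in range(k):
            total = sum(TRANS[i][j] * EMIT[j][nxt] * bwd[t + 1][j] for j in range(k))
            bwd[t][i] = total / scale[t + 1]

    # Posterior of the 'site' state (index 1) at each position.
    return [fwd[t][1] * bwd[t][1] for t in range(n)]


if __name__ == "__main__":
    sequence = "ATAT" + "GC" * 5 + "ATAT"
    for base, p in zip(sequence, posterior_site_probability(sequence)):
        print(base, round(p, 3))
```

In this toy setting, positions inside the GC-rich stretch receive a higher posterior probability of belonging to the `site` state than the flanking AT positions, because the posterior combines the emission evidence with the first-order dependence on neighbouring states.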
Similar resources
Decoding cis-regulatory DNAs in the Drosophila genome.
Cis-regulatory DNAs control the timing and sites of gene expression during metazoan development. Changes in gene expression are responsible for the morphological diversification of metazoan body plans. However, traditional methods for the identification and characterization of cis-regulatory DNAs are tedious. During the past year, computational methods have been used to identify novel cis-DNAs ...
A probabilistic method to detect regulatory modules
MOTIVATION: The discovery of cis-regulatory modules in metazoan genomes is crucial for understanding the connection between genes and organism diversity. RESULTS: We develop a computational method that uses Hidden Markov Models and an Expectation Maximization algorithm to detect such modules, given the weight matrices of a set of transcription factors known to work together. Two novel features ...
CisModule: de novo discovery of cis-regulatory modules by hierarchical mixture modeling.
The regulatory information for a eukaryotic gene is encoded in cis-regulatory modules. The binding sites for a set of interacting transcription factors have the tendency to colocalize to the same modules. Current de novo motif discovery methods do not take advantage of this knowledge. We propose a hierarchical mixture approach to model the cis-regulatory module structure. Based on the model, a ...
Decoding Noncoding Regulatory DNAs in Metazoan Genomes
The recent revelation that the human genome contains only ~30,000 genes underscores the importance of gene regulation in generating organismal diversity. Cis-regulatory DNAs, or enhancers, are short stretches of DNA--300 bp to 1,000 bp in length--that control gene expression. This DNA accounts for a substantial fraction of metazoan genomes, but is largely invisible. It cannot be identified by s...
hiHMM: Bayesian non-parametric joint inference of chromatin state maps
MOTIVATION: Genome-wide mapping of chromatin states is essential for defining regulatory elements and inferring their activities in eukaryotic genomes. A number of hidden Markov model (HMM)-based methods have been developed to infer chromatin state maps from genome-wide histone modification data for an individual genome. To perform a principled comparison of evolutionarily distant epigenomes, we...